Learning Discriminative Visual N-grams from Mid-level Image Features

Authors

  • Raj Kumar Gupta
  • Megha Pandey
  • Alex Yong Sang Chia
Abstract

Mid-level image features have been shown to help bridge the semantic gap between low-level and high-level image representations. Many existing methods for learning mid-level visual elements consider each mid-level feature individually and do not take their mutual relationships into account. We follow the intuitive idea that learning discriminative combinations of visual elements can help us deal with ambiguities better, and propose the concept of visual n-grams to effectively represent combinations of visual elements along with their relative spatial configuration and co-occurrence relationships. An overview of our approach is shown in Figure 1. Figure 1(a) shows the process of learning discriminative visual n-grams based on the relative spatial position, orientation and co-occurrence relationships of mid-level image patches. Figure 1(b) further shows how these visual n-grams are used to learn the final feature vector that represents training and test images.
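The abstract does not give the details of how a visual n-gram is encoded, so the following is only a minimal sketch of the general idea for the n = 2 case, under stated assumptions. It assumes each detected mid-level patch is a hypothetical tuple `(element_id, x, y)` in normalized image coordinates, quantizes the relative direction and distance between two patches into bins, and counts the resulting keys per image. The names `bigram_key`, `image_bigrams` and `feature_vector`, and the bin counts, are illustrative and not taken from the paper; the discriminative selection of n-grams is not shown.

```python
import math
from collections import Counter
from itertools import combinations


def bigram_key(det_a, det_b, n_angle_bins=8, n_dist_bins=3, max_dist=1.0):
    """Encode a pair of patch detections as a hashable key.

    Each detection is an assumed (element_id, x, y) tuple. The key holds the
    two element ids plus the quantized relative direction and distance.
    """
    # Sort so the key does not depend on the order of the two detections.
    (id_a, xa, ya), (id_b, xb, yb) = sorted([det_a, det_b])
    dx, dy = xb - xa, yb - ya
    angle = math.atan2(dy, dx) % (2 * math.pi)
    angle_bin = int(angle / (2 * math.pi) * n_angle_bins) % n_angle_bins
    dist = math.hypot(dx, dy)
    dist_bin = min(int(dist / max_dist * n_dist_bins), n_dist_bins - 1)
    return (id_a, id_b, angle_bin, dist_bin)


def image_bigrams(detections):
    """Count every pairwise key (co-occurrence) in one image."""
    return Counter(bigram_key(a, b) for a, b in combinations(detections, 2))


def feature_vector(detections, vocabulary):
    """Histogram of an image's bigrams over a fixed, assumed vocabulary."""
    counts = image_bigrams(detections)
    return [counts.get(key, 0) for key in vocabulary]


# Hypothetical detections: element 0 to the left of element 1, plus a third patch.
image = [(0, 0.2, 0.5), (1, 0.6, 0.5), (2, 0.2, 0.9)]
vocab = [bigram_key(image[0], image[1]), (5, 7, 0, 0)]
print(feature_vector(image, vocab))
```

In this sketch the vocabulary would be the set of n-grams kept after some discriminative selection step; how the paper scores and selects them is not stated in the abstract.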

Similar resources

Polarimetric Synthetic Aperture Radar Image Classification using Bag of Visual Words Algorithm

Land cover is defined as the physical material of the surface of the earth, including different vegetation covers, bare soil, water surface, various urban areas, etc. Land cover and its changes are very important and influential on the Earth and life of living organisms, especially human beings. Land cover change monitoring is important for protecting the ecosystem, forests, farmland, open spac...

Unsupervised Discovery of Mid-Level Discriminative Patches

The goal of this paper is to discover a set of discriminative patches which can serve as a fully unsupervised mid-level visual representation. The desired patches need to satisfy two requirements: 1) to be representative, they need to occur frequently enough in the visual world; 2) to be discriminative, they need to be different enough from the rest of the visual world. The patches could corres...

Recognition of Visual Events using Spatio-Temporal Information of the Video Signal

Recognition of visual events as a video analysis task has become popular in machine learning community. While the traditional approaches for detection of video events have been used for a long time, the recently evolved deep learning based methods have revolutionized this area. They have enabled event recognition systems to achieve detection rates which were not reachable by traditional approac...

Ensemble of Part Detectors for Simultaneous Classification and Localization

Part-based representation has been proven to be effective for a variety of visual applications. However, automatic discovery of discriminative parts without object/part-level annotations is challenging. This paper proposes a discriminative mid-level representation paradigm based on the responses of a collection of part detectors, which only requires image-level labels. Towards this goal, w...

Mid-level Representation for Visual Recognition

Visual Recognition is one of the fundamental challenges in AI, where the goal is to understand the semantics of visual data. Employing mid-level representation, in particular, shifted the paradigm in visual recognition. The mid-level image/video representation involves discovering and training a set of mid-level visual patterns (e.g., parts and attributes) and represent a given image/video util...

Publication date: 2015